Labeler agreement in phonetic labeling of continuous speech

Authors

  • Ronald A. Cole
  • Beatrice T. Oshika
  • Mike Noel
  • Terri Lander
  • Mark A. Fanty
Abstract

This paper analyzes inter-labeler agreement of label choice and boundary placement for human phonetic transcriptions of continuous telephone speech in different languages. In experiment one, English, German, Mandarin and Spanish are labeled by fluent speakers of the languages. In experiment two, German and Hindi are labeled by linguists who do not speak the languages. Experiment two uses a somewhat finer phonetic transcription set than experiment one. We compare the transcriptions of the utterances in terms of the minimum number of substitutions, insertions and deletions needed to map one transcription to the other. Native speakers agree on average 67.52% of the time at the finest level of labeling, including diacritics. Non-native linguists agree 34.41% of the time. The implications of the results for the evaluation of phonetic recognition algorithms are discussed.
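
The comparison described in the abstract (the minimum number of substitutions, insertions and deletions needed to map one transcription to the other) can be illustrated with a standard edit-distance alignment. The following is a minimal sketch, not the authors' implementation: the function names `edit_counts` and `agreement_rate`, the unit cost for every edit, and the agreement formula (matching labels divided by all aligned positions) are assumptions made for illustration, since the abstract does not state how the percentages were computed.

```python
from typing import List, Sequence, Tuple


def edit_counts(a: Sequence[str], b: Sequence[str]) -> Tuple[int, int, int]:
    """Return (substitutions, insertions, deletions) for one minimum-cost
    mapping of label sequence `a` onto label sequence `b`.
    Assumption: every edit has cost 1."""
    n, m = len(a), len(b)
    # dist[i][j] = minimum number of edits turning a[:i] into b[:j]
    dist: List[List[int]] = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion from a
                dist[i][j - 1] + 1,         # insertion into a
            )

    # Walk back through the table to count each edit type.
    subs = ins = dels = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if a[i - 1] == b[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                subs += cost
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            ins += 1
            j -= 1
    return subs, ins, dels


def agreement_rate(a: Sequence[str], b: Sequence[str]) -> float:
    """Hypothetical agreement measure: matching labels divided by the
    total number of aligned positions (matches plus all edits)."""
    subs, ins, dels = edit_counts(a, b)
    matches = len(a) - subs - dels
    total = matches + subs + ins + dels
    return 1.0 if total == 0 else matches / total


if __name__ == "__main__":
    labeler_1 = ["k", "ae", "t"]        # made-up label sequences
    labeler_2 = ["k", "eh", "t", "s"]
    print(edit_counts(labeler_1, labeler_2))     # (1, 1, 0)
    print(agreement_rate(labeler_1, labeler_2))  # 0.5
```

When several minimum-cost alignments exist, the split between edit types depends on the tie-breaking order in the backtrace (diagonal first here), while the total number of edits does not.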


Similar resources

Multi-language Speech Database: Creation and Phonetic Labeling Agreement

The focus of the paper is the evaluation of inter-labeler reliability on broad phonetic transcriptions when labelers do not necessarily know the language they are labeling. We provide an analysis of label disagreements, presenting results from six languages, Spanish, and Vietnamese with a total of 2 minutes of continuous labeled speech. Labeler agreement across languages ranges from 41 percent...


Labeler agreement in transcribing Korean intonation with K-ToBI

This paper reports labeler agreement in the transcription of Korean prosody using Korean ToBI (K-ToBI) [9]. Twenty utterances representing five different types of speech were produced by 18 speakers and transcribed by 21 labelers differing in their levels of experience with K-ToBI. Following the stringent metric used for English ToBI evaluation [14,12], consistency was measured in terms of the ...


Introducing deep modular neural networks with a dual spatio-temporal structure to improve Persian continuous speech recognition

In this article, growable deep modular neural networks for continuous speech recognition are introduced. These networks can be grown to implement the spatio-temporal information of the frame sequences at their input layer as well as their labels at the output layer at the same time. The trained neural network with such double spatio-temporal association structure can learn the phonetic sequence...


Efficient Hierarchical Labeler Algorithm for Gaussian Likelihoods Computation in Resource Constrained Speech Recognition Systems

This paper presents a new time/memory-efficient algorithm for the evaluation of state likelihoods in an HMM-based speech recognizer where the states are modeled by Gaussian Mixtures. We first present a fast hierarchical labeling scheme and then an improved version, which is specifically geared toward use in recognizers that use asynchronous search (e.g., stack search) as opposed to synchronous ...


Automatic Labeling of Corpora for Speech

One of the bottlenecks in the development of text-to-speech synthesizers based on segment concatenation is the need for large, segmented and labeled corpora. Consequently, as manual segmentation and labeling is a tedious and time consuming task, there is a strong demand for automatic labeling systems which can label speech from many languages. Several systems have been proposed already, but the...



Publication date: 1994